@mhauru mhauru commented Nov 5, 2025

Replicating the changes made to Mooncake in chalk-lab/Mooncake.jl#832.

Results for the benchmarks that pass, run on this branch:

benchmarking rosenbrock...
  Run Original Function:  185.542 μs (24 allocations: 6.25 MiB)
  Run TapedTask: #produce=1;   6.784 ms (299145 allocations: 10.83 MiB)
benchmarking ackley...
  Run Original Function:  1.154 ms (0 allocations: 0 bytes)
  Run TapedTask: #produce=100000;   60.412 ms (899584 allocations: 21.36 MiB)
benchmarking matrix_test...
  Run Original Function:  97.417 μs (18 allocations: 576.47 KiB)
  Run TapedTask: #produce=1;   468.625 μs (530 allocations: 594.08 KiB)
benchmarking neural_net...
  Run Original Function:  445.707 ns (8 allocations: 576 bytes)
  Run TapedTask: #produce=1;   12.625 μs (168 allocations: 6.27 KiB)

on main:

benchmarking rosenbrock...
  Run Original Function:  169.583 μs (24 allocations: 6.25 MiB)
  Run TapedTask: #produce=1;   2.062 ms (679 allocations: 6.27 MiB)
benchmarking ackley...
  Run Original Function:  1.154 ms (0 allocations: 0 bytes)
  Run TapedTask: #produce=100000;   56.672 ms (699582 allocations: 18.31 MiB)
benchmarking matrix_test...
  Run Original Function:  98.250 μs (18 allocations: 576.47 KiB)
  Run TapedTask: #produce=1;   294.750 μs (529 allocations: 594.06 KiB)
benchmarking neural_net...
  Run Original Function:  430.065 ns (8 allocations: 576 bytes)
  Run TapedTask: #produce=1;   12.041 μs (168 allocations: 6.27 KiB)

And on main, but skipping the manual optimisation and just calling misty_closure directly:

benchmarking rosenbrock...
  Run Original Function:  173.083 μs (24 allocations: 6.25 MiB)
  Run TapedTask: #produce=1;   6.428 ms (299145 allocations: 10.83 MiB)
benchmarking ackley...
  Run Original Function:  1.153 ms (0 allocations: 0 bytes)
  Run TapedTask: #produce=100000;   61.659 ms (899584 allocations: 21.36 MiB)
benchmarking matrix_test...
  Run Original Function:  98.542 μs (18 allocations: 576.47 KiB)
  Run TapedTask: #produce=1;   389.250 μs (530 allocations: 594.08 KiB)
benchmarking neural_net...
  Run Original Function:  435.187 ns (8 allocations: 576 bytes)
  Run TapedTask: #produce=1;   11.791 μs (168 allocations: 6.27 KiB)
done

Two of these are close, but on this branch the TapedTask run is about 230% slower for rosenbrock and about 60% slower for matrix_test. Both results are very similar to what I get on main if I just disable the manual optimisation.
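
For anyone wanting to check those percentages, here is a minimal sketch of the arithmetic in Julia. The helper name slowdown_pct and the timings_ms table are hypothetical, introduced only for illustration; the numbers are the TapedTask timings from the output above (this branch vs. main), converted to milliseconds.

```julia
# Percentage by which `t_new` is slower than `t_old` (both in the same time unit).
# `slowdown_pct` is an illustrative helper, not part of Libtask or Mooncake.
slowdown_pct(t_new, t_old) = (t_new / t_old - 1) * 100

# TapedTask timings copied from the benchmark output, in milliseconds:
# name => (this branch, main)
timings_ms = [
    "rosenbrock"  => (6.784, 2.062),
    "ackley"      => (60.412, 56.672),
    "matrix_test" => (0.468625, 0.294750),
    "neural_net"  => (0.012625, 0.012041),
]

for (name, (on_branch, on_main)) in timings_ms
    pct = round(slowdown_pct(on_branch, on_main); digits=1)
    println(rpad(name, 12), pct, "% slower on this branch")
end
```

These are single-run numbers, so differences of a few percent (as for ackley and neural_net) are probably within noise.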

github-actions bot commented Nov 5, 2025

Libtask.jl documentation for PR #205 is available at:
https://TuringLang.github.io/Libtask.jl/previews/PR205/

mhauru commented Nov 5, 2025

After some optimisations provided by @Technici4n, the benchmarks on this branch are now on par with main:

benchmarking rosenbrock...
  Run Original Function:  193.709 μs (24 allocations: 6.25 MiB)
  Run TapedTask: #produce=1;   1.871 ms (678 allocations: 6.27 MiB)
benchmarking ackley...
  Run Original Function:  1.159 ms (0 allocations: 0 bytes)
  Run TapedTask: #produce=100000;   53.375 ms (500092 allocations: 15.26 MiB)
benchmarking matrix_test...
  Run Original Function:  99.625 μs (18 allocations: 576.47 KiB)
  Run TapedTask: #produce=1;   295.333 μs (528 allocations: 594.05 KiB)
benchmarking neural_net...
  Run Original Function:  439.394 ns (8 allocations: 576 bytes)
  Run TapedTask: #produce=1;   11.916 μs (168 allocations: 6.27 KiB)

That makes this ready for review.

@mhauru mhauru marked this pull request as ready for review November 5, 2025 13:18
@mhauru mhauru requested a review from sunxd3 November 5, 2025 13:22

@sunxd3 sunxd3 left a comment

looks clean

sunxd3 commented Nov 5, 2025

great work, both!

mhauru commented Nov 6, 2025

This is really all @Technici4n; I just copied over some of his code from Mooncake/Slack and added comments. Thank you!

@mhauru mhauru merged commit a3e259e into main Nov 6, 2025
11 of 14 checks passed
@mhauru mhauru deleted the mhauru/1.12-valid-worlds branch November 6, 2025 10:26
